Genetic Programming for Symbolic Regression
Abstract
Genetic programming (GP) is a supervised learning method motivated by an analogy to biological evolution. GP creates successor hypotheses by repeatedly mutating and recombining (crossover) parts of the current best hypotheses, in the expectation that a good solution emerges during evolution. The task in this report was a symbolic regression problem: finding the symbolic function that matches a given data set as closely as possible. Training the GP algorithm on the given data set produced functions that represent the relationship between input and output. Once the error rate reached a chosen threshold, training could be stopped and the best function applied to test data to verify its effectiveness. The observations I made during the experiments concern two aspects of GP: the search space and the variety of generations. First, the function and terminal sets had a strong impact on performance. Too many functions and terminals increased the search space exponentially, resulting in long training times and more local minima, so the algorithm was more likely to be trapped in suboptimal solutions. Too few functions, on the other hand, could not adequately represent the relationship in the given data, so no fitting function was found. Second, the depth of the trees influences the search space: constraining the maximum tree depth during training helped find good functions more quickly. Third, too little variety made the solution converge to local minima. The population size, the selection method, and the selection percentage are the main factors that affect the variety of generations. With finely tuned parameters, the genetic programming applied in this project can find a best function with an error rate of 0.143 within 60 generations.
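The loop described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the report's actual setup: the function set (`+`, `-`, `*`), the terminal set (`x`, `1.0`, `2.0`), mean absolute error as the error measure, truncation selection with a `keep_fraction` standing in for the selection percentage, and the default population size and depth limit. All function names are hypothetical.

```python
import operator
import random

# Assumed function and terminal sets; the report does not list its own.
FUNCTIONS = [(operator.add, 2), (operator.sub, 2), (operator.mul, 2)]
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, max_depth):
    """Grow a random expression tree as nested tuples (func, child, ...)."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    func, arity = rng.choice(FUNCTIONS)
    return (func,) + tuple(random_tree(rng, max_depth - 1) for _ in range(arity))

def evaluate(tree, x):
    """Evaluate a tree at input x."""
    if not isinstance(tree, tuple):
        return x if tree == "x" else tree
    func, *children = tree
    return func(*(evaluate(child, x) for child in children))

def depth(tree):
    """Depth of a tree; a lone terminal has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

def error(tree, data):
    """Mean absolute error over (x, y) pairs (an assumed error measure)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in data) / len(data)

def subtree_paths(tree, path=()):
    """Paths (tuples of child indices) to every node in the tree."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(subtree_paths(child, path + (i,)))
    return paths

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(rng, a, b):
    """Replace a random subtree of a with a random subtree of b."""
    target = rng.choice(subtree_paths(a))
    donor = get_subtree(b, rng.choice(subtree_paths(b)))
    return replace_subtree(a, target, donor)

def mutate(rng, tree):
    """Replace a random subtree with a freshly grown small tree."""
    return replace_subtree(tree, rng.choice(subtree_paths(tree)), random_tree(rng, 2))

def evolve(data, pop_size=100, generations=60, keep_fraction=0.3,
           max_depth=5, seed=0):
    """Truncation-selection GP with a hard limit on tree depth."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        if error(population[0], data) < 1e-9:  # stop once the error threshold is met
            break
        survivors = population[: max(1, int(pop_size * keep_fraction))]
        children = []
        while len(survivors) + len(children) < pop_size:
            if rng.random() < 0.8:
                child = crossover(rng, rng.choice(survivors), rng.choice(survivors))
            else:
                child = mutate(rng, rng.choice(survivors))
            if depth(child) <= max_depth:  # discard offspring that grow too deep
                children.append(child)
        population = survivors + children
    return min(population, key=lambda t: error(t, data))

if __name__ == "__main__":
    # Toy target y = x^2 + x, chosen only for this illustration.
    samples = [(x / 2, (x / 2) ** 2 + x / 2) for x in range(-4, 5)]
    best = evolve(samples)
    print("error of best tree:", error(best, samples))
```

In this sketch, `keep_fraction` and `pop_size` control how much variety survives each generation, while `max_depth` and the sizes of `FUNCTIONS` and `TERMINALS` bound the search space, which are the two aspects the abstract singles out. Because the survivors are carried over unchanged, the best error never increases from one generation to the next.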
Similar resources
Shuffled Frog-Leaping Programming for Solving Regression Problems
There are various automatic programming models inspired by evolutionary computation techniques. Because it is important to devise an automatic mechanism for exploring the complicated search space of mathematical problems where numerical methods fail, evolutionary computation is widely studied and applied to real-world problems. One of the well-known algorithms for optimization problems is shuffl...
Using Symbolic Regression to Infer Strategies from Experimental Data
We propose the use of a new technique, symbolic regression, as a method for inferring the strategies that are being played by subjects in economic decision-making experiments. We begin by describing symbolic regression and our implementation of this technique using genetic programming. We provide a brief overview of how our algorithm works and how it can be used to uncover simple data generating ...
Metamodeling by symbolic regression and Pareto simulated annealing
The subject of this paper is a new approach to symbolic regression. Other publications on symbolic regression use genetic programming. This paper describes an alternative method based on Pareto simulated annealing. Our method is based on linear regression for the estimation of constants. Interval arithmetic is applied to ensure the consistency of a model. To prevent overfitting, we merit a mode...
Stepwise Adaptation of Weights for Symbolic Regression with Genetic Programming
In this paper we continue our study of the Stepwise Adaptation of Weights (saw) technique. Previous studies on constraint satisfaction and data classification have indicated that saw is a promising technique to boost the performance of evolutionary algorithms. Here we use saw to boost the performance of a genetic programming algorithm on simple symbolic regression problems. We measure the performance o...
Glyph: Symbolic Regression Tools
We present Glyph, a Python package for genetic programming based symbolic regression. Glyph is designed for use in both numerical simulations and real-world experiments. For experimentalists, glyph-remote provides a separation of tasks: a ZeroMQ interface splits the genetic programming optimization task from the evaluation of an experimental (or numerical) run. Glyph can be accessed at htt...